The Information in Contingency Tables -
An Application of Information-Theoretic Concepts
to the Analysis of Contingency Tables


by 


C. T. Ireland and S. Kullback


1. Introduction

The  primary  purpose  of  this  paper  is  to  present  an  exposition  of  the 
methodology  underlying  the  analysis  of  the  information  in  contingency 
tables.  We  shall  stress  the  concepts,  techniques,  analyses  and  inferences 
without  entering  into  extensive  technical  statistical  proofs  or  detailed 
references  to  the  bibliography  at  the  end. 

It is useful to note that we are concerned with an aspect of multivariate
(multiple variates) analysis with particular application to qualitative or
categorical as well as quantitative variables. The basic data we deal with
are counts in multiway cross-classifications or multiway contingency
tables. Multiway contingency tables, or cross-classifications of vectors
of discrete random variables, provide a useful approach to the analysis
of multivariate discrete data.

As  we  shall  see,  the  analytic  procedures  serve  to  bring  out  various 
interrelationships  among  the  classificatory  variables  in  a multiway 
cross-classification or contingency table in many dimensions. Classical
problems in the historical development of the analysis of contingency
tables concerned themselves with such questions as the independence or
conditional independence of the classificatory variables, or homogeneity
or conditional homogeneity of the classificatory variables over time or
space, for example. Such classical problems turn out to be special cases
of the techniques we shall discuss. These techniques result in analyses
which are essentially regression type analyses. As such they enable us
to determine the relationship of one or more "dependent" qualitative or
categorical variables of interest to a set of "independent" classificatory
variables, as well as the relative effects of changes in the "independent"
variables on the "dependent" variables. In particular, such problems as
the determination of possible factors and measures of their effect and
interactions in the representation of the logits of one or more
dichotomous variables lend themselves to the analysis we shall
examine.

The  methodology  is  based  on  the  Principle  of  Minimum  Discrimination 
Information  Estimation,  associated  statistics  and  Analyses  of  Information. 
General  computer  programs  are  available  to  provide  the  data  for  the 
inferences. 

2.  Contingency  Tables 

We assume that the reader has some familiarity with cross-classifications
in the form of contingency tables. We use a slightly modified conventional
notation. For example, for a four-way contingency table, that is, one with
four classifications or variables, each of several categories, not
necessarily the same in number, we represent the observed number of
occurrences in the (ijkl) cell of the contingency table by x(ijkl), where
the indices i, j, k, l range over the respective categories of the
variables. The corresponding probabilities are represented by p(ijkl).
Summation over one or more indices, resulting in various marginal
distributions or marginals, is indicated by a dot or dots, thus

$\sum_i x(ijkl) = x(.jkl), \quad \sum_j \sum_l x(ijkl) = x(i.k.),$ etc.,

with a similar notation for the probabilities.

We shall denote estimates under various hypotheses or models by
$x_a^*(ijkl)$, where values of the subscript a will range over the
hypotheses or models.

An example of a 2x2 two-way contingency table is shown in Table 2.1.


Table 2.1
x(ij)

          j = 1     j = 2
i = 1     x(11)     x(12)     x(1.)
i = 2     x(21)     x(22)     x(2.)
          x(.1)     x(.2)     x(..)

The estimated two-way table under the hypothesis or model of
independence is shown in Table 2.2.

Table 2.2
x*(ij)

          j = 1            j = 2
i = 1     x(1.)x(.1)/n     x(1.)x(.2)/n     x(1.)
i = 2     x(2.)x(.1)/n     x(2.)x(.2)/n     x(2.)
          x(.1)            x(.2)            n

A common statistical measure of the association or interaction between
the variables of a two-way 2x2 contingency table is the cross-product
ratio, or its logarithm. The cross-product ratio is defined by

(2.1)  $\dfrac{x(11)\,x(22)}{x(12)\,x(21)}$ ,

though we shall be more concerned with its logarithm

(2.2)  $\ln \dfrac{x(11)\,x(22)}{x(12)\,x(21)}$ .

We shall use natural logarithms, that is, logarithms to the base e,
rather than common logarithms to the base 10, because of the nature of the
underlying mathematical statistical theory. Note that with the estimate
for independence, or no association, the logarithm of the cross-product
ratio is zero,

(2.3)  $\ln \dfrac{x^*(11)\,x^*(22)}{x^*(12)\,x^*(21)} = \ln \dfrac{\dfrac{x(1.)x(.1)}{n} \cdot \dfrac{x(2.)x(.2)}{n}}{\dfrac{x(1.)x(.2)}{n} \cdot \dfrac{x(2.)x(.1)}{n}} = \ln 1 = 0 .$

The logarithm of the cross-product ratio is positive if the odds satisfy
the inequalities

$\dfrac{x(11)}{x(21)} > \dfrac{x(12)}{x(22)}$  or  $\dfrac{x(11)}{x(12)} > \dfrac{x(21)}{x(22)}$ ,

since then we get for the log-odds

$\ln \dfrac{x(11)\,x(22)}{x(12)\,x(21)} = \ln \dfrac{x(11)}{x(21)} - \ln \dfrac{x(12)}{x(22)} = \ln \dfrac{x(11)}{x(12)} - \ln \dfrac{x(21)}{x(22)} > 0 .$

The logarithm of the cross-product ratio is negative if the odds satisfy
the inequalities

$\dfrac{x(11)}{x(21)} < \dfrac{x(12)}{x(22)}$  or  $\dfrac{x(11)}{x(12)} < \dfrac{x(21)}{x(22)}$ ,

since then we get for the log-odds

$\ln \dfrac{x(11)\,x(22)}{x(12)\,x(21)} = \ln \dfrac{x(11)}{x(21)} - \ln \dfrac{x(12)}{x(22)} = \ln \dfrac{x(11)}{x(12)} - \ln \dfrac{x(21)}{x(22)} < 0 .$

The logarithm of the cross-product ratio thus varies from $-\infty$ to
$+\infty$.

Later we consider procedures for assessing the significance of the
deviation of the logarithm of the cross-product ratio from zero, the value
corresponding to no association or no interaction.

For the three-way 2x2x2 contingency table, in addition to the classic
types of independence, interaction or association, there arises an
additional one important historically and practically. This is known as
no three-factor or no second-order interaction. No three-factor or no
second-order interaction implies that the logarithm of the association
measured by the cross-product ratio for any two of the variables is the
same for all the values of the third variable, that is, there is no
second-order interaction if

(2.4)
$\ln \dfrac{x(111)\,x(221)}{x(121)\,x(211)} = \ln \dfrac{x(112)\,x(222)}{x(122)\,x(212)}$    (i, j)

$\ln \dfrac{x(111)\,x(212)}{x(112)\,x(211)} = \ln \dfrac{x(121)\,x(222)}{x(122)\,x(221)}$    (i, k)

$\ln \dfrac{x(111)\,x(122)}{x(112)\,x(121)} = \ln \dfrac{x(211)\,x(222)}{x(212)\,x(221)}$    (j, k) .


One is concerned with the possible hypothesis or model of no
second-order interaction when none of the other types of independence are
found. Moreover, in this case, the corresponding estimate cannot be
expressed explicitly in terms of observed marginals although the estimate
is constrained to have the same two-way marginals as the observed table.
Straightforward iterative procedures exist to determine the estimate
under the hypothesis or model of no second-order interaction. For the
general three-way rxsxt contingency table there are of course many more
relations among the log cross-product ratios like (2.4) which must be
satisfied, but the iterative procedures to determine the estimate extend
to the general case with no difficulty.

For four-way and higher order contingency tables the problem of
presentation of the data increases, as do the variety and number of
questions about relationships of possible interest and varieties of
interaction. The basic ideas, concepts, notation and terminology we have
mentioned for the two- and three-way contingency tables extend to the more
general cases as we consider the methodology. For some additional
prefatory remarks see Ku et al (1971).
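The iterative determination of the no second-order interaction estimate
mentioned in this section can be sketched as follows. This is an
illustration only: it assumes the iteration is the familiar cyclic
rescaling to each two-way marginal in turn, starting from a uniform
table, and the counts and function names are hypothetical rather than
taken from the paper or its computer programs.

```python
import math
from itertools import product

CELLS = list(product((0, 1), repeat=3))   # 2x2x2 cells, 0-based indices

def marginal(table, axes):
    """Sum a table (dict: cell -> value) down to the given axes."""
    out = {}
    for cell, v in table.items():
        key = tuple(cell[a] for a in axes)
        out[key] = out.get(key, 0.0) + v
    return out

def fit_two_way_marginals(x, cycles=200):
    """Cycle through the three two-way marginals, rescaling to match each in turn."""
    n = sum(x.values())
    est = {c: n / len(CELLS) for c in CELLS}          # start from the uniform table
    for _ in range(cycles):
        for axes in ((0, 1), (0, 2), (1, 2)):
            target, current = marginal(x, axes), marginal(est, axes)
            for c in CELLS:
                key = tuple(c[a] for a in axes)
                est[c] *= target[key] / current[key]
    return est

def log_cpr_ij(table, k):
    """Log cross-product ratio of variables i, j at level k of the third variable."""
    return math.log(table[0, 0, k] * table[1, 1, k] / (table[0, 1, k] * table[1, 0, k]))

counts = [10, 20, 30, 15, 25, 5, 12, 18]              # hypothetical observed counts
x = dict(zip(CELLS, map(float, counts)))
x_star = fit_two_way_marginals(x)
```

In this sketch the fitted table reproduces the observed two-way
marginals, and its (i, j) log cross-product ratio is the same at both
levels of k, which is the first relation of (2.4).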

3.  Discrimination  Information 

To make the discussion more specific and with no essential restriction
on the generality, we shall present it in terms of the analysis of
four-way contingency tables. Let us consider the collection of four-way
contingency tables RxSxTxU of dimension rxsxtxu. For convenience let
us denote the aggregate of all cell identifications by $\Omega$ with
individual cells identified by $\omega$ so that the generic variable is
$\omega = (i,j,k,l)$, $i = 1,\dots,r$, $j = 1,\dots,s$, $k = 1,\dots,t$,
$l = 1,\dots,u$. Suppose there are two probability distributions or
contingency tables (we shall use these terms interchangeably) defined
over the space $\Omega$, say $p(\omega)$, $\pi(\omega)$,
$\sum_\Omega p(\omega) = 1$, $\sum_\Omega \pi(\omega) = 1$. The
discrimination information is defined by

(3.1)  $I(p:\pi) = \sum_\Omega p(\omega) \ln \dfrac{p(\omega)}{\pi(\omega)}$ .

The basis for this definition, its properties, and relation to other
definitions of information measures will not be considered in detail in
this exposition. For the particular types of application to which we
shall restrict this exposition the $\pi$-distribution, $\pi(\omega)$, in
the definition (3.1) according to the problem of interest may either be
specified, or it may be an estimated distribution. The p-distribution,
$p(\omega)$, in the definition (3.1) ranges over or is a member of a
family of distributions of interest.

Of the various properties of $I(p:\pi)$ we mention in particular the
fact that $I(p:\pi) \ge 0$ and $= 0$ if and only if $p(\omega) = \pi(\omega)$.
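Definition (3.1) and the non-negativity property can be illustrated with
a short computation. This is a sketch under stated assumptions: the
distributions are hypothetical, the function name is ours, and terms
with $p(\omega) = 0$ are taken to contribute zero.

```python
import math

def discrimination_information(p, pi):
    """I(p:pi) = sum of p(w) ln(p(w)/pi(w)); terms with p(w) = 0 contribute 0."""
    return sum(pw * math.log(pw / qw) for pw, qw in zip(p, pi) if pw > 0)

pi = [0.25, 0.25, 0.25, 0.25]          # uniform distribution over four cells
p = [0.4, 0.3, 0.2, 0.1]               # hypothetical p-distribution
info_p = discrimination_information(p, pi)
info_same = discrimination_information(pi, pi)
```

Here `info_p` is strictly positive because p differs from the uniform
distribution, and `info_same` is zero.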


4. Minimum Discrimination Information Estimation

Many problems in the analysis of contingency tables may be characterized
as estimating a distribution or contingency table subject to certain
restraints and then comparing the estimated table with an observed table
to determine whether the observed table satisfies a null hypothesis or
model implied by the restraints. In accordance with the principle of
minimum discrimination information estimation we determine that member of
the collection or family of p-distributions satisfying the restraints
which minimizes the discrimination information $I(p:\pi)$ over all members
of the family of pertinent p-distributions. We denote the minimum
discrimination information estimate by $p^*(\omega)$ so that

(4.1)  $I(p^*:\pi) = \sum p^*(\omega) \ln \dfrac{p^*(\omega)}{\pi(\omega)} = \min I(p:\pi)$ .

Unless otherwise stated, the summation is over $\Omega$ which will be
omitted.

In a wide class of problems which can be characterized as "smoothing"
or fitting an observed contingency table the restraints specify that the
estimated distribution or contingency table have some set of marginals
which are the same as those of an observed contingency table. In such
cases $\pi(\omega)$ is taken to be either the uniform distribution
$\pi(ijkl) = 1/rstu$ or a distribution already estimated subject to
restraints contained in and implied by the restraints under examination.
The latter case includes the classical hypotheses of independence,
conditional independence, homogeneity, conditional homogeneity and
interaction, all of which can be considered as instances of generalized
independence and will be considered in some detail in this paper. By
generalized independence is meant the fact that the estimates may be
expressed as a product of factors which are functions of appropriate
marginals. See Ku et al (1971).

5. Minimum Discrimination Information Statistic

To test whether an observed contingency table is consistent with
the null hypothesis or model as represented by the minimum discrimination
information estimate we compute a measure of the deviation between the
observed distribution and the appropriate estimate by the minimum
discrimination information statistic. For notational convenience
[words illegible in source] let us denote the estimated contingency table
by $x^*(\omega) = np^*(\omega)$. For the class of problems in which the
restraints imply a set of marginals which are the same as those of an
observed table [remainder of clause illegible in source], the minimum
discrimination information (m.d.i.) statistic is

(5.1)  $2I(x:x^*) = 2 \sum x(\omega) \ln \dfrac{x(\omega)}{x^*(\omega)}$

which is asymptotically distributed as a $\chi^2$ with appropriate
degrees of freedom under the null hypothesis. The statistic in (5.1) is
also minus twice the logarithm of the classic likelihood ratio statistic
but this is not necessarily true for other kinds of applications of the
general theory.
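As an illustration of (5.1), the m.d.i. statistic for the hypothesis of
independence in a 2x2 table can be computed directly from the observed
counts and the estimate of Table 2.2. The counts below are hypothetical
and the function name `mdi_statistic` is ours; this is a sketch, not the
paper's computer program.

```python
import math

def mdi_statistic(x, x_star):
    """2I(x:x*) = 2 * sum of x(w) ln(x(w)/x*(w)) over cells with x(w) > 0."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(x, x_star) if o > 0)

x = [20.0, 10.0, 5.0, 15.0]            # hypothetical counts for cells 11, 12, 21, 22
n = sum(x)
rows = [x[0] + x[1], x[2] + x[3]]      # x(1.), x(2.)
cols = [x[0] + x[2], x[1] + x[3]]      # x(.1), x(.2)
x_star = [rows[i] * cols[j] / n for i in range(2) for j in range(2)]   # Table 2.2
stat = mdi_statistic(x, x_star)
```

For a 2x2 table the independence hypothesis has one degree of freedom,
so `stat` would be referred to a chi-square distribution with 1 d.f.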


6. Minimum Discrimination Information Theorem

We now present a theorem which is the basis for the principle of
minimum discrimination information estimation and its applications. We
shall present it in a form related to the context of this discussion on
the analysis of contingency tables.

Let us consider the space $\Omega$ mentioned in Section 3 and the
discrimination information introduced in (3.1). Suppose now, for example,
that there are three linearly independent statistics of interest defined
over the space $\Omega$,

(6.1)  $T_1(\omega),\ T_2(\omega),\ T_3(\omega)$ .

We want to determine the value of $p(\omega)$ which minimizes the
discrimination information

(6.2)  $I(p:\pi) = \sum p(\omega) \ln \dfrac{p(\omega)}{\pi(\omega)}$

over the family of p-distributions which satisfies the restraints

(6.3)
$\sum T_1(\omega)\,p(\omega) = \theta_1^*$
$\sum T_2(\omega)\,p(\omega) = \theta_2^*$
$\sum T_3(\omega)\,p(\omega) = \theta_3^*$

where $\theta_1^*$, $\theta_2^*$, $\theta_3^*$ are specified values, and
$\pi(\omega)$ is a fixed distribution.

If $\pi(\omega)$ satisfies the restraints (6.3), then of course the
minimum value of $I(p:\pi)$ is zero and the minimizing distribution is
$p^*(\omega) = \pi(\omega)$. More generally, the minimum discrimination
information theorem states that the minimizing distribution is given by


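(6.4)  $p^*(\omega) = \dfrac{\exp\left(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)\right)\,\pi(\omega)}{M(\tau_1,\tau_2,\tau_3)}$

where

(6.5)  $M(\tau_1,\tau_2,\tau_3) = \sum \exp\left(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)\right)\,\pi(\omega)$

is a normalizing factor so that $\sum p^*(\omega) = 1$, and the $\tau$'s
are parameters which technically are in essence undetermined Lagrange
multipliers whose values are defined in terms of $\theta_1^*$,
$\theta_2^*$, $\theta_3^*$ by

(6.6)
$\theta_1^* = \dfrac{\partial}{\partial \tau_1} \ln M(\tau_1,\tau_2,\tau_3)$
$\quad = \left(\sum \exp\left(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)\right) T_1(\omega)\,\pi(\omega)\right) / M(\tau_1,\tau_2,\tau_3)$
$\quad = \sum T_1(\omega)\,p^*(\omega)$

$\theta_2^* = \dfrac{\partial}{\partial \tau_2} \ln M(\tau_1,\tau_2,\tau_3)$
$\quad = \left(\sum \exp\left(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)\right) T_2(\omega)\,\pi(\omega)\right) / M(\tau_1,\tau_2,\tau_3)$
$\quad = \sum T_2(\omega)\,p^*(\omega)$

$\theta_3^* = \dfrac{\partial}{\partial \tau_3} \ln M(\tau_1,\tau_2,\tau_3)$
$\quad = \left(\sum \exp\left(\tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)\right) T_3(\omega)\,\pi(\omega)\right) / M(\tau_1,\tau_2,\tau_3)$
$\quad = \sum T_3(\omega)\,p^*(\omega)$ .

We can now state a number of consequences of the preceding.

We note first that $p^*(\omega)$ is a member of an exponential family of
distributions generated by $\pi(\omega)$ and as such has the desirable
statistical properties of members of an exponential family, which include
all the common and classic distributions. We may also write (6.4) as

(6.7)  $\ln \dfrac{p^*(\omega)}{\pi(\omega)} = -\ln M(\tau_1,\tau_2,\tau_3) + \tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)$
$\quad = L + \tau_1 T_1(\omega) + \tau_2 T_2(\omega) + \tau_3 T_3(\omega)$

with $L = -\ln M(\tau_1,\tau_2,\tau_3)$. The regression or log-linear
expression in (6.7) for $\ln(p^*(\omega)/\pi(\omega))$ with $T_1(\omega)$,
$T_2(\omega)$, $T_3(\omega)$ as the explanatory variables and $\tau_1$,
$\tau_2$, $\tau_3$ as the regression coefficients plays an important role
in the analysis we shall consider.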
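The form (6.4) and the defining relations (6.6) can be illustrated in the
simplest case of a single statistic $T(\omega)$ on a four-cell space.
The sketch below is hypothetical in every detail (the distribution, the
value of $\theta^*$, the function names, and the use of bisection to
solve for $\tau$); it is not the iterative procedure of the paper.

```python
import math

def tilt(pi, T, tau):
    """p*(w) = exp(tau T(w)) pi(w) / M(tau), the one-statistic case of (6.4)."""
    weights = [math.exp(tau * t) * q for t, q in zip(T, pi)]
    M = sum(weights)
    return [w / M for w in weights]

def solve_tau(pi, T, theta, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection on tau so that sum T(w) p*(w) = theta; this sum increases with tau."""
    def mean(tau):
        return sum(t * p for t, p in zip(T, tilt(pi, T, tau)))
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if mean(mid) < theta:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

pi = [0.25, 0.25, 0.25, 0.25]      # uniform pi over four cells
T = [1.0, 1.0, 0.0, 0.0]           # indicator of the first two cells
theta = 0.7                        # hypothetical specified value theta*
tau = solve_tau(pi, T, theta)
p_star = tilt(pi, T, tau)
L = -math.log(sum(math.exp(tau * t) * q for t, q in zip(T, pi)))   # L = -ln M
```

In this example the restraint forces the first two cells to carry total
probability 0.7, so `p_star` is approximately (0.35, 0.35, 0.15, 0.15),
and each cell satisfies the log-linear form (6.7) with the computed `L`
and `tau`.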


We note next that the minimum value of the discrimination information
(6.2) is

(6.8)  $I(p^*:\pi) = \tau_1\theta_1^* + \tau_2\theta_2^* + \tau_3\theta_3^* - \ln M(\tau_1,\tau_2,\tau_3)$

where the $\theta^*$'s are defined in (6.3) and the $\tau$'s are
determined to satisfy (6.6). Using the value in (6.7) it may be shown
that if $p(\omega)$ is any member of the family of distributions
satisfying (6.3), then

(6.9)  $I(p:\pi) = I(p^*:\pi) + I(p:p^*)$ .

The Pythagorean type property (6.9) plays an important role in the
analysis of information tables.
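The Pythagorean type property (6.9) can be verified numerically in a
case where the estimate has a closed form. The sketch assumes a 2x2
table, a uniform $\pi$, and restraints fixing both one-way marginals, for
which the m.d.i. estimate is the product of the marginals (an instance of
generalized independence). The distribution p is hypothetical.

```python
import math

def info(p, q):
    """I(p:q) = sum of p ln(p/q) over cells with p > 0."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

p = [0.4, 0.2, 0.1, 0.3]                     # hypothetical distribution, cells 11, 12, 21, 22
pi = [0.25] * 4                              # uniform pi
rows = [p[0] + p[1], p[2] + p[3]]            # p(1.), p(2.)
cols = [p[0] + p[2], p[1] + p[3]]            # p(.1), p(.2)
p_star = [rows[i] * cols[j] for i in range(2) for j in range(2)]   # same marginals as p

lhs = info(p, pi)                            # I(p:pi)
rhs = info(p_star, pi) + info(p, p_star)     # I(p*:pi) + I(p:p*)
```

The two sides agree up to rounding, since p satisfies the same marginal
restraints as `p_star`.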


7. Computational Procedures

An "experiment" has been designed and observations made resulting
in a multi-dimensional contingency table with the desired classifications
and categories. All the information the analyst hopes to obtain from the
"experiment" is contained in the contingency table. In the process of
analysis, the aim is to fit the observed table by a minimal or
parsimonious number of parameters depending on some or all of the
marginals, that is, to find out how much of this total information is
contained in a summary consisting of sets of marginals. Indeed, the
relationship between the concept of independence or association and
interaction in contingency tables and the role the marginals play is
evidenced in the historical developments in the extensive literature on
the analysis of contingency tables. Thus, the $\theta^*$'s in the
preceding discussion will be the marginals of interest. See Ku et al
(1971).


7.1. The $T(\omega)$ Functions. The $T(\omega)$ functions for the RxSxTxU
table turn out to be a basic set of simple functions and their various
products. Thus, for example, the $T(\omega)$ function associated with the
one-way marginal p(2...) is

(7.1)  $T_2^R(ijkl) = 1$ for i = 2, any j, k, l,
       $\phantom{T_2^R(ijkl)} = 0$ otherwise,

since

(7.2)  $\sum p(ijkl)\,T_2^R(ijkl) = p(2...)$ .

Similarly the $T(\omega)$ function associated with the one-way marginal
p(..3.), for example, is

(7.3)  $T_3^T(ijkl) = 1$ for k = 3, any i, j, l,
       $\phantom{T_3^T(ijkl)} = 0$ otherwise,

since

(7.4)  $\sum p(ijkl)\,T_3^T(ijkl) = p(..3.)$ .

Thus for the rxsxtxu table there are

(7.5)
(r-1) linearly independent functions $T_\alpha^R(ijkl)$, $\alpha = 1,\dots,r-1$
(s-1) linearly independent functions $T_\beta^S(ijkl)$, $\beta = 1,\dots,s-1$
(t-1) linearly independent functions $T_\gamma^T(ijkl)$, $\gamma = 1,\dots,t-1$
(u-1) linearly independent functions $T_\delta^U(ijkl)$, $\delta = 1,\dots,u-1$

since, for example,

$\sum_{\alpha=1}^{r} \sum T_\alpha^R(ijkl) = rstu$ .

We have arbitrarily excluded the functions corresponding to $\alpha = r$,
$\beta = s$, $\gamma = t$, $\delta = u$ as a matter of convenience. We
could have selected $\alpha = 1$, $\beta = 1$, $\gamma = 1$, $\delta = 1$
or any other set of values.

The $T(\omega)$ function associated with the two-way marginal p(12..)
say, is $T_1^R(ijkl)\,T_2^S(ijkl)$ since from the definition of
$T_1^R(ijkl)$ and $T_2^S(ijkl)$ it may be seen that

(7.6)  $T_1^R(ijkl)\,T_2^S(ijkl) = 1$ for i = 1, j = 2, any k, l,
       $\phantom{T_1^R(ijkl)\,T_2^S(ijkl)} = 0$ otherwise,

and

(7.7)  $\sum p(ijkl)\,T_1^R(ijkl)\,T_2^S(ijkl) = p(12..)$ .

For convenience we shall write
$T_\alpha^R(ijkl)\,T_\beta^S(ijkl) = T_{\alpha\beta}^{RS}(ijkl)$, etc.
Thus the $T(\omega)$ function associated with any two-way marginal is a
product of two appropriate functions of the set (7.5).

Similarly the $T(\omega)$ function associated with any three-way marginal
will be a product of three of the appropriate functions of the set (7.5),
for example,

(7.8)  $\sum p(ijkl)\,T_2^R(ijkl)\,T_1^T(ijkl)\,T_3^U(ijkl) = p(2.13)$ .

For convenience we shall write
$T_\alpha^R(ijkl)\,T_\beta^S(ijkl)\,T_\gamma^T(ijkl) = T_{\alpha\beta\gamma}^{RST}(ijkl)$,
etc.

Similarly the $T(\omega)$ function associated with any four-way marginal
will be a product of four of the appropriate functions of the set (7.5),
for example,

(7.9)  $\sum p(ijkl)\,T_2^R(ijkl)\,T_1^S(ijkl)\,T_1^T(ijkl)\,T_2^U(ijkl) = p(2112)$ .

For convenience we shall write

$T_\alpha^R(ijkl)\,T_\beta^S(ijkl)\,T_\gamma^T(ijkl)\,T_\delta^U(ijkl) = T_{\alpha\beta\gamma\delta}^{RSTU}(ijkl)$ .
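The indicator functions of this section and their products are simple to
express in code. The sketch below is illustrative only: the table
dimensions, the distribution p, and the helper names `T` and `T_product`
are hypothetical. It checks the relation (7.8), that summing p against a
product of three indicators returns the three-way marginal p(2.13).

```python
from itertools import product

DIMS = (2, 2, 3, 3)                                   # hypothetical r, s, t, u
CELLS = list(product(*(range(1, d + 1) for d in DIMS)))

def T(axis, level):
    """Indicator T(w): 1 if index number `axis` (0..3) of the cell equals `level`, else 0."""
    return lambda cell: 1 if cell[axis] == level else 0

def T_product(*factors):
    """Product of one-way indicator functions, as used for higher-order marginals."""
    def f(cell):
        value = 1
        for g in factors:
            value *= g(cell)
        return value
    return f

# A hypothetical distribution with weights proportional to i + 2j + 3k + 4l.
weights = {c: c[0] + 2 * c[1] + 3 * c[2] + 4 * c[3] for c in CELLS}
total = sum(weights.values())
p = {c: w / total for c, w in weights.items()}

T_2_13 = T_product(T(0, 2), T(2, 1), T(3, 3))          # i = 2, k = 1, l = 3 as in (7.8)
lhs = sum(p[c] * T_2_13(c) for c in CELLS)
rhs = sum(p[c] for c in CELLS if c[0] == 2 and c[2] == 1 and c[3] == 3)   # p(2.13)
p_2 = sum(p[c] * T(0, 2)(c) for c in CELLS)            # p(2...) as in (7.2)
```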
We note that there are a total of

(7.10)
$N_1 = (r-1) + (s-1) + (t-1) + (u-1)$

$N_2 = (r-1)(s-1) + (r-1)(t-1) + (r-1)(u-1) + (s-1)(t-1) + (s-1)(u-1) + (t-1)(u-1)$

$N_3 = (r-1)(s-1)(t-1) + (r-1)(s-1)(u-1) + (r-1)(t-1)(u-1) + (s-1)(t-1)(u-1)$

$N_4 = (r-1)(s-1)(t-1)(u-1)$ ,

respectively, of the simple linearly independent functions and their
products two, three, four at a time. It may be verified that

(7.11)  $rstu - 1 = N = N_1 + N_2 + N_3 + N_4$ .

These values of the number of $T(\omega)$ functions (or associated tau
parameters) appear as appropriate degrees of freedom in the analysis of
information tables.
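The identity (7.11) is easy to confirm for particular dimensions. A
minimal sketch follows; the dimensions and the function name
`count_T_functions` are hypothetical, and `math.prod` requires Python 3.8
or later.

```python
from itertools import combinations
from math import prod

def count_T_functions(dims):
    """Return [N1, N2, ...]: sums of products of the (d - 1) factors taken 1, 2, ... at a time."""
    reduced = [d - 1 for d in dims]
    return [sum(prod(c) for c in combinations(reduced, m))
            for m in range(1, len(dims) + 1)]

dims = (2, 3, 4, 5)                     # hypothetical r, s, t, u
N1, N2, N3, N4 = count_T_functions(dims)
N = prod(dims) - 1                      # rstu - 1
```

For these dimensions the counts are 10, 35, 50 and 24, which add to
119 = 120 - 1.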


7.2. The Estimated $p^*(\omega)$ Values. In the usual least squares
regression analysis procedure, one first computes the regression
coefficients and then gets the values of the estimates. In the
methodology we use we reverse the procedure. Instead of trying to obtain
the values of the $\tau$'s from (6.6) (which is possible) we shall first
obtain the values of the estimates $p^*(\omega)$ by a straightforward
convergent iterative procedure and then derive the values of the $\tau$'s
from (6.7). We shall not discuss the details of the iteration here, as
they are in the computer program and have been described elsewhere. The
iteration may be described as successively cycling through adjustments of
the marginals of interest starting with the $\pi(\omega)$ distribution
until a desired accuracy of agreement between the set of observed
marginals of interest and the computed marginals has been attained. See
Ku et al (1971).


7.3. The $\tau$ Values or Interaction Parameters. From the definitions
of the $T(\omega)$ functions in Section 7.1 it is clear that they take on
only the values 0 or 1 for each value of $\omega$. From the nature of the
$T(\omega)$ functions the set of regression or log-linear equations (6.7)
will have a subset with a single unknown $\tau$ value which can be
determined. Then there will be a set with one additional unknown value and
some of the $\tau$'s already determined. These new unknown $\tau$ values
can then be determined. This process of successive evaluation is carried
on until all the values of $\tau$ are determined. They are also available
as output of a general computer program.
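The successive evaluation described in Section 7.3 can be illustrated on
a 2x2 table fitted by independence. The sketch assumes the log-linear
form $\ln(x^*(ij)/n\pi(ij)) = L + \tau_1^i T_1^R + \tau_1^j T_1^S$ with
indicators for level 1 of each variable and a uniform $\pi$; the fitted
values and variable names are hypothetical.

```python
import math

n_pi = 50.0 / 4                                  # n * pi(ij) for n = 50 and uniform pi
x_star = {(1, 1): 12.0, (1, 2): 18.0, (2, 1): 8.0, (2, 2): 12.0}   # hypothetical independence fit
y = {cell: math.log(v / n_pi) for cell, v in x_star.items()}      # ln(x*(ij) / n pi(ij))

L = y[2, 2]                  # cell 22: every T function is 0, so the equation is y = L
tau_i = y[1, 2] - L          # cell 12: y = L + tau_i, one new unknown
tau_j = y[2, 1] - L          # cell 21: y = L + tau_j, one new unknown
residual = y[1, 1] - (L + tau_i + tau_j)   # cell 11 has no further unknown in this model
```

Each step introduces at most one new unknown, and the remaining equation
for cell 11 is satisfied because the fitted table has the product form.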


8. Graphic Representation

A useful graphic representation of the log-linear regression (6.7)
is given in Figure 8.1 for a 2x2x2x2 contingency table. This is the
analogue of the design matrix in normal regression theory. The blank
spaces in Figure 8.1 represent zero values. The (ijkl)-columns are the
cell identifications in the same lexicographic order as the cell entries
for the estimates in the computer output. Column 1 corresponds to L
which is essentially a normalizing factor. Each of the columns 2 to 16
represents the corresponding values of the $T(\omega)$ functions, columns 2
to 5 those for the one-way marginals, columns 6 to 11 those for the
two-way marginals, columns 12 to 15 those for the three-way marginals, and
column 16 that for the four-way marginal. For convenience the columns
are also arranged in lexicographic order. The tau parameter associated
with the $T(\omega)$ function is given at the head of the column. The full
representation with all the columns of Figure 8.1 generates the observed
values. Thus the rows represent

(8.1)  $\ln \dfrac{p(ijkl)}{\pi(ijkl)} = \ln \dfrac{x(ijkl)}{n\pi(ijkl)} = L + \tau_1^{i}\,T_1^{R}(ijkl) + \dots + \tau_{111}^{jkl}\,T_{111}^{STU}(ijkl) + \tau_{1111}^{ijkl}\,T_{1111}^{RSTU}(ijkl)$

where $\pi(ijkl)$ in the 2x2x2x2 case is 1/2x2x2x2 and the numerical
values of L and the taus depend on the observed values x(ijkl). The
design matrix corresponding to an estimate uses only those columns
associated with the marginals explicit and implied in the fitting process.

Figure 8.1. Graphic representation. [The entries of the figure are not
recoverable from the source. Its rows are the cells (ijkl) in the order
1111, 1112, 1121, 1122, 1211, 1212, 1221, 1222, 2111, 2112, 2121, 2122,
2211, 2212, 2221, 2222.]

This is a reflection of the fact that higher order marginals imply certain
lower order marginals, for example, the two-way marginal x(ij..) implies,
by summation over i and j, the one-way marginals x(.j..), x(i...),
and the total n = x(....). Thus the estimate based on fitting the one-way
marginals will use only columns 1-5. The values of L and the taus for this
estimate will be different from those for x(ijkl) and depend on the
estimate x*(ijkl). Thus if we denote the estimate based on fitting the
one-way marginals as $x_1^*(ijkl)$, the representation in Figure 8.1
implies

(8.2)
$\ln \dfrac{x_1^*(1111)}{n\pi} = L + \tau_1^i + \tau_1^j + \tau_1^k + \tau_1^l$

$\ln \dfrac{x_1^*(1112)}{n\pi} = L + \tau_1^i + \tau_1^j + \tau_1^k$

$\vdots$

$\ln \dfrac{x_1^*(2222)}{n\pi} = L$ .


[Passage illegible in the source.]


The representation for the uniform distribution corresponds to column 1
only. Note that in accordance with (7.10) and (7.11)

$N_1 = 1 + 1 + 1 + 1 = 4$  (columns 2 to 5)
$N_2 = 1 + 1 + 1 + 1 + 1 + 1 = 6$  (columns 6 to 11)
$N_3 = 1 + 1 + 1 + 1 = 4$  (columns 12 to 15)
$N_4 = 1$  (column 16)

$N = 2 \times 2 \times 2 \times 2 - 1 = 15 = 4 + 6 + 4 + 1$ .
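The layout described for Figure 8.1 can be regenerated from the
definitions of Section 7.1. The sketch below is a reconstruction under
assumptions, not a copy of the original figure: it takes each $T$
function to be the indicator of level 1 of the corresponding variables
(the level retained when the last level of each variable is excluded in a
2x2x2x2 table), and it orders the columns by the number of variables
involved and then lexicographically. The names are hypothetical.

```python
from itertools import combinations, product

CELLS = list(product((1, 2), repeat=4))          # 1111, 1112, ..., 2222 in lexicographic order
# Column 1 is for L (the empty subset); the rest are subsets of the four variables by size.
SUBSETS = [s for m in range(5) for s in combinations(range(4), m)]

def design_row(cell):
    """Entry is 1 when every variable in the subset is at level 1 for this cell, else 0."""
    return [int(all(cell[a] == 1 for a in subset)) for subset in SUBSETS]

matrix = [design_row(c) for c in CELLS]
columns_by_order = [sum(1 for s in SUBSETS if len(s) == m) for m in range(1, 5)]
```

The column counts by order come out as 4, 6, 4 and 1, matching the values
of $N_1$ to $N_4$ above, and the row for cell 2222 has a 1 in column 1
only.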


9. Analysis of Information

Although the previous discussion has been in terms of probabilities and
estimated probabilities or relative frequencies, in practice we have
found it more convenient not to divide everything by n, the total number
of occurrences, and to deal with observed or estimated occurrences, that
is, with $n\pi(ijkl) = n/rstu$, x(ijkl), x(i...), x(.jk.),
$x^*(ijkl) = np^*(ijkl)$, etc. The analysis of information is based on the
fundamental relation (6.9) for the minimum discrimination information
statistics. Specifically if $x_a^*(\omega)$ is the minimum discrimination
information estimate corresponding to a set $H_a$ of given marginals and
$x_b^*(\omega)$ is the minimum discrimination information estimate
corresponding to a set $H_b$ of given marginals, where $H_a$ is explicitly
or implicitly contained in $H_b$, then the basic relations are

(9.1)
$2I(x:n\pi) = 2I(x_a^*:n\pi) + 2I(x:x_a^*)$
$2I(x:n\pi) = 2I(x_b^*:n\pi) + 2I(x:x_b^*)$
$2I(x_b^*:n\pi) = 2I(x_a^*:n\pi) + 2I(x_b^*:x_a^*)$
$2I(x:x_a^*) = 2I(x_b^*:x_a^*) + 2I(x:x_b^*)$

with a corresponding additive relation for the associated degrees of
freedom.


In terms of the representation in (6.4) or (6.7) or Figure 8.1 as
an exponential family, for our discussion, the two extreme cases are the
uniform distribution for which all $\tau$'s are zero, and the observed
contingency table or distribution for which all N = rstu - 1 $\tau$'s are
needed.


Measures  of  the  form  2I(x:x*)  , that  is,  the  comparison  of  an 

cl 

observed  contingency  table  with  an  estimated  contingency  table,  are  called 
measures  of  interaction  or  goodnass-of-f it.  Measures  of  the  form 
2I(:^:x*)  , comparing  two  estimated  contingency  tables,  am  called  mea- 
sures of  effect,  that  is  the  efface  of  the  marginals  in  the  set  but 

r.ot  in  the  set  H . ir  the  cans  in  x"  but  not  in  x . Us  note  that 
e a a 

21(:::>:rf)  tests  a nuh.  hypothesis  that  the  .nines  of  the  •:  parameters  ir. 

the  representation  oF  the  observed  conct  r.gency  t .ale  x(_)  but  not  in  the 

r enreoentation  of  the  estimate-,  cable  x"  ( <)  are  :;oro  and  the  number  of 

a 

t.tese  teas  is  the  numoer  o.  de,  trees  of  i re  adorn.  Similarly  21  (xi:?;"} 

tests  a null  nypotuesis  that  tue  values  of  the  set  of  t parame i_ers 
in  tne  representation  of  the  estimated  table  x^(a))  but  not  in 
tne  representation  of  the  estimated  table  x*(w)  are  zero  and  the 
nuwoer  of  tnese  taus  is  the  r.u..iber  of  degrees  of  freedom. 

We summarize the additive relationships of the m.d.i.
statistics and the associated degrees of freedom in the Analysis
of Information Table 9.1.
TABLE 9.1

ANALYSIS OF INFORMATION TABLE

Component due to          Information            D.F.

$H_a$ : Interaction       $2I(x:x_a^*)$          $N_a$

$H_b$ : Effect            $2I(x_b^*:x_a^*)$      $N_a - N_b$

        Interaction       $2I(x:x_b^*)$          $N_b$


Since measures of the form $2I(x:x_a^*)$ may also be interpreted as
measures of the "variation unexplained" by the estimate $x_a^*$, the
additive relationship leads to the interpretation of the ratio

(9.2)    $\dfrac{2I(x:x_a^*) - 2I(x:x_b^*)}{2I(x:x_a^*)} = \dfrac{2I(x_b^*:x_a^*)}{2I(x:x_a^*)}$

as the percentage of the unexplained variation due to $x_a^*$ accounted
for by the additional constraints defining $x_b^*$. The ratio (9.2) is
thus similar to the squared correlation coefficients associated with
normal distributions.
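
The ratio (9.2) is simple arithmetic once the two interaction statistics
are known. The following minimal sketch shows the calculation; the function
name `explained_fraction` is ours, and the two numbers are the values of
$2I(x:x^*)$ and $2I(x:x_a^*)$ quoted in Analysis of Information Table 11.2
of Section 11.

```python
def explained_fraction(info_a, info_b):
    """Ratio (9.2): share of 2I(x:x*_a) accounted for by the
    additional constraints that define x*_b.

    info_a -- 2I(x:x*_a), interaction under the smaller set of marginals
    info_b -- 2I(x:x*_b), interaction under the larger, nested set
    """
    if info_a <= 0:
        raise ValueError("2I(x:x*_a) must be positive")
    # By the additive relation, info_a - info_b is the effect 2I(x*_b:x*_a).
    return (info_a - info_b) / info_a


# Values taken from Analysis of Information Table 11.2.
ratio = explained_fraction(160.551, 21.819)
print(round(ratio, 2))
```

Because the numerator is an effect statistic with its own degrees of
freedom, the same function can be applied to any pair of nested estimates.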


We remark that the marginals, explicit and implicit, of the estimated
table $x_a^*(\omega)$ which form the set of restraints $H_a$ used to
generate $x_a^*(\omega)$ are the same as the corresponding marginals of
the observed $x(\omega)$ table and all lower order implied marginals. It
may be shown that $2I(x:x_a^*)$ is approximately a quadratic in the
differences between the remaining marginals of the $x(\omega)$ table and
the corresponding ones as calculated from the $x_a^*(\omega)$ table.

Similarly $2I(x_b^*:x_a^*)$ is also approximately a quadratic in the
differences between those additional marginal restraints in $H_b$ but not
in $H_a$ and the corresponding marginal values as computed from the
$x_a^*(\omega)$ table.


As may be seen, because of the nature of the $T(\omega)$ functions
described in Section 7.1 or indicated in Figure 8.1, the $\tau$'s are
determined from the log-linear regression Equations (6.7) (see (8.2) and
(10.3)) as sums and differences of values of $\ln x^*(\omega)$. A variety
of statistics have been presented in the literature for the analysis of
contingency tables which are quadratics in differences of marginal values
or quadratics in the $\tau$'s or the linear combinations of logarithms of
the observed or estimated values. The principle of minimum discrimination
information estimation and its procedures thus provides a unifying
relationship since such statistics may be seen as quadratic approximations
of the minimum discrimination information statistic. We remark that the
corresponding approximate $\chi^2$'s are not generally additive.

We mention the approximations in terms of quadratic forms in the
marginals or the $\tau$'s as a possible bridge connecting the familiar
procedures of classical regression analysis and the procedures proposed
here to assist in understanding and interpreting the analysis of
information tables. The covariance matrix of the $T(\omega)$ functions or
the taus can be obtained for either the observed table or any of the
estimated tables, as well as the inverse matrices, as part of the output
of the general computer program. See (10.4) to (10.9).


10. The 2x2 Table

It may be useful to reexamine the 2x2 table from the point of
view of the preceding discussion. The algebraic details are simple in this
case and exhibit the unification of the information theoretic development.

Suppose we have the observed 2x2 table in Figure 10.1

        x(11)    x(12)  |  x(1.)
        x(21)    x(22)  |  x(2.)
        ----------------+-------
        x(.1)    x(.2)  |  n

Figure 10.1
If we obtain the m.d.i. estimate fitting the one-way marginals, the
generalized independence hypothesis is the classical independence
hypothesis and the minimum discrimination information estimate is
$x^*(ij) = x(i.)x(.j)/n$. The representation of the log-linear regression
(6.7) as in Figure 8.1 for the full model is given in Figure 10.2. The
entries in the columns $\tau_1$, $\tau_2$, $\tau_3$

        i  j  |  L   $\tau_1$  $\tau_2$  $\tau_3$
        1  1  |  1      1         1         1
        1  2  |  1      1
        2  1  |  1                1
        2  2  |  1

Figure 10.2

are, respectively, the values of the functions $T_1(ij)$, $T_2(ij)$,
$T_3(ij)$ associated with the marginals $\theta_1 = x(1.)$,
$\theta_2 = x(.1)$, $\theta_3 = x(11)$, and the column headed L
corresponds to the normalizing factor (the negative of the logarithm of
the moment-generating function as in (6.7)).


We recall the interpretation of Figure 10.2 as the log-linear
relations

(10.1)   $\ln\dfrac{x(11)}{n\pi} = L + \tau_1 + \tau_2 + \tau_3$

         $\ln\dfrac{x(12)}{n\pi} = L + \tau_1$

         $\ln\dfrac{x(21)}{n\pi} = L + \tau_2$

         $\ln\dfrac{x(22)}{n\pi} = L$

From (10.1) we find

(10.2)   $L = \ln\dfrac{x(22)}{n/4}$ ,

         $\tau_1 = \ln\dfrac{x(12)}{x(22)}$ ,

         $\tau_2 = \ln\dfrac{x(21)}{x(22)}$ ,

         $\tau_3 = \ln\dfrac{x(11)x(22)}{x(12)x(21)}$

or

(10.3)   $\tau_1 = \ln x(12) - \ln x(22)$ ,

         $\tau_2 = \ln x(21) - \ln x(22)$ ,

         $\tau_3 = \ln x(11) + \ln x(22) - \ln x(12) - \ln x(21)$ .
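
As a small numerical illustration of (10.1) to (10.3), the sketch below
computes L and the three taus for a 2x2 table and then rebuilds the cell
counts from them. The counts are hypothetical, the function name
`loglinear_2x2` is ours, and the uniform weight 1/4 is taken from the
factor n/4 that appears in (10.2).

```python
import math


def loglinear_2x2(x11, x12, x21, x22):
    """Return (L, tau1, tau2, tau3) of (10.2) for strictly positive counts."""
    n = x11 + x12 + x21 + x22
    L = math.log(x22 / (n / 4.0))
    tau1 = math.log(x12) - math.log(x22)
    tau2 = math.log(x21) - math.log(x22)
    # tau3 is the logarithm of the cross-product ratio.
    tau3 = math.log(x11) + math.log(x22) - math.log(x12) - math.log(x21)
    return L, tau1, tau2, tau3


# Hypothetical counts, used only to exercise the formulas.
counts = (10.0, 20.0, 30.0, 40.0)
L, tau1, tau2, tau3 = loglinear_2x2(*counts)
n = sum(counts)

# Invert (10.1): each cell is (n/4) * exp(sum of the terms in its row).
rebuilt = (
    (n / 4.0) * math.exp(L + tau1 + tau2 + tau3),
    (n / 4.0) * math.exp(L + tau1),
    (n / 4.0) * math.exp(L + tau2),
    (n / 4.0) * math.exp(L),
)
print(rebuilt)
```

The round trip confirms that the four parameters of the full model
reproduce the observed table exactly, which is why no degrees of freedom
remain for that model.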

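If we denote by $T$ the matrix with columns the columns of Figure 10.2,
that is,

(10.4)   $T = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}$

and define a diagonal matrix $D$ with main diagonal the elements $x(ij)$,
that is,

(10.5)   $D = \begin{pmatrix} x(11) & 0 & 0 & 0 \\ 0 & x(12) & 0 & 0 \\ 0 & 0 & x(21) & 0 \\ 0 & 0 & 0 & x(22) \end{pmatrix}$

then the estimate of the covariance matrix of $\theta_1 = x(1.)$,
$\theta_2 = x(.1)$, $\theta_3 = x(11)$ for the observed contingency table
is $S_{22.1}$ where

(10.6)   $S = T'DT$

(10.7)   $S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}$ ,  $S_{22.1} = S_{22} - S_{21}S_{11}^{-1}S_{12}$

and $S_{11}$ is 1 x 1, $S_{22}$ is 3 x 3, $S_{12}$ is 1 x 3. It is
found that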

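(10.8)   $S_{22.1} = \begin{pmatrix} \dfrac{x(1.)x(2.)}{n} & x(11) - \dfrac{x(1.)x(.1)}{n} & \dfrac{x(11)x(2.)}{n} \\ x(11) - \dfrac{x(1.)x(.1)}{n} & \dfrac{x(.1)x(.2)}{n} & \dfrac{x(11)x(.2)}{n} \\ \dfrac{x(11)x(2.)}{n} & \dfrac{x(11)x(.2)}{n} & x(11) - \dfrac{x(11)^2}{n} \end{pmatrix}$

and the inverse matrix is

(10.9)   $S_{22.1}^{-1} = \begin{pmatrix} \dfrac{1}{x(12)} + \dfrac{1}{x(22)} & \dfrac{1}{x(22)} & -\dfrac{1}{x(12)} - \dfrac{1}{x(22)} \\ \dfrac{1}{x(22)} & \dfrac{1}{x(21)} + \dfrac{1}{x(22)} & -\dfrac{1}{x(21)} - \dfrac{1}{x(22)} \\ -\dfrac{1}{x(12)} - \dfrac{1}{x(22)} & -\dfrac{1}{x(21)} - \dfrac{1}{x(22)} & \dfrac{1}{x(11)} + \dfrac{1}{x(12)} + \dfrac{1}{x(21)} + \dfrac{1}{x(22)} \end{pmatrix}$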

We remark that the matrix in (10.9) is the covariance matrix of the
$\tau$'s in (10.3). Similar results hold in general and for estimated
tables.
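
The relation between (10.6) to (10.9) can be checked numerically. The
sketch below is only an illustration with hypothetical counts and helper
names of our own: it forms S = T'DT from the design of Figure 10.2, takes
the Schur complement of the leading 1 x 1 block, and multiplies the result
by the closed form given for the inverse.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def marginal_covariance(x11, x12, x21, x22):
    """S_22.1 of (10.7): covariance of x(1.), x(.1), x(11)."""
    # Rows are cells 11, 12, 21, 22; columns are L, tau1, tau2, tau3.
    T = [[1, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 0]]
    x = [x11, x12, x21, x22]
    D = [[x[i] if i == j else 0.0 for j in range(4)] for i in range(4)]
    Tt = [list(col) for col in zip(*T)]
    S = matmul(matmul(Tt, D), T)
    # Schur complement S22 - S21 * S11^{-1} * S12, with S11 the scalar S[0][0].
    return [[S[i][j] - S[i][0] * S[0][j] / S[0][0] for j in range(1, 4)]
            for i in range(1, 4)]


def tau_covariance(x11, x12, x21, x22):
    """Closed form (10.9) for the covariance matrix of the taus."""
    a, b, c, d = 1.0 / x11, 1.0 / x12, 1.0 / x21, 1.0 / x22
    return [[b + d, d, -(b + d)],
            [d, c + d, -(c + d)],
            [-(b + d), -(c + d), a + b + c + d]]


# Hypothetical counts.
cells = (10.0, 20.0, 30.0, 40.0)
product = matmul(marginal_covariance(*cells), tau_covariance(*cells))
print(product)
```

For any table with positive counts the product should be the 3 x 3
identity up to rounding, confirming that (10.9) is the inverse of (10.8).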

Note that the value of the logarithm of the cross-product ratio, a
measure of association or interaction, appears in the course of the
analysis as the value of $\tau_3$ for the observed values $x(ij)$, and
that $\tau_3 = 0$ for $x^*(ij)$, the estimate under the hypothesis of
independence, for which the representation as in Figure 10.2 does not
involve the last column since it is obtained by fitting the one-way
marginals.


The log-linear relations for the estimate $x^*(ij)$ are

(10.10)  $\ln\dfrac{x^*(11)}{n\pi} = L + \tau_1 + \tau_2$

         $\ln\dfrac{x^*(12)}{n\pi} = L + \tau_1$

         $\ln\dfrac{x^*(21)}{n\pi} = L + \tau_2$

         $\ln\dfrac{x^*(22)}{n\pi} = L$

where the numerical values of $L$, $\tau_1$, $\tau_2$ in (10.10) depend
on $x^*$ and differ from the values in (10.1).

The minimum discrimination information statistic to test the null
hypothesis or model of independence is $2I(x:x^*)$ with one degree of
freedom. In this case the quadratic approximation is

(10.11)  $2I(x:x^*) \approx (x(11) - x^*(11))^2 \left(\dfrac{1}{x^*(11)} + \dfrac{1}{x^*(12)} + \dfrac{1}{x^*(21)} + \dfrac{1}{x^*(22)}\right)$ .

Remembering that $x^*(ij) = x(i.)x(.j)/n$, the right-hand side of (10.11)
may also be shown to be

(10.12)  $\chi^2 = \sum_i \sum_j \dfrac{(x(ij) - x(i.)x(.j)/n)^2}{x(i.)x(.j)/n}$ ,

the classical $\chi^2$-test for independence with one degree of freedom.
Another test which has been proposed for the null hypothesis of no
association or no interaction in the 2x2 table is

(10.13)  $(\ln x(11) + \ln x(22) - \ln x(12) - \ln x(21))^2 \left(\dfrac{1}{x(11)} + \dfrac{1}{x(12)} + \dfrac{1}{x(21)} + \dfrac{1}{x(22)}\right)^{-1}$

which may be shown to be a quadratic approximation for $2I(x:x^*)$ in
terms of $\tau_3$ with the covariance matrix estimated using the observed
values and not the estimated values. We remark that if the observed values
are used to estimate the covariance matrix then instead of the classical
$\chi^2$-test in (10.12) there is derived the modified Neyman chi-square

(10.14)  $\chi_1^2 = \sum_i \sum_j (x(ij) - x(i.)x(.j)/n)^2 / x(ij)$ .
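
To see how close these one-degree-of-freedom statistics are in practice,
the following sketch evaluates the m.d.i. statistic together with (10.11)
to (10.14) for one 2x2 table. The counts are hypothetical and the function
and key names are ours; the formulas are the ones displayed above, under
the assumption that every cell count is positive.

```python
import math


def two_by_two_statistics(x11, x12, x21, x22):
    """Statistics of Section 10 for a 2x2 table with positive counts."""
    x = [[x11, x12], [x21, x22]]
    n = x11 + x12 + x21 + x22
    rows = [x11 + x12, x21 + x22]
    cols = [x11 + x21, x12 + x22]
    # m.d.i. estimate under independence: x*(ij) = x(i.)x(.j)/n
    est = [[rows[i] * cols[j] / n for j in range(2)] for i in range(2)]
    cells = [(x[i][j], est[i][j]) for i in range(2) for j in range(2)]

    mdi = 2.0 * sum(o * math.log(o / e) for o, e in cells)
    quad = (x11 - est[0][0]) ** 2 * sum(1.0 / e for _, e in cells)  # (10.11)
    chi2 = sum((o - e) ** 2 / e for o, e in cells)                  # (10.12)
    log_cpr = math.log(x11 * x22 / (x12 * x21))
    log_form = log_cpr ** 2 / sum(1.0 / o for o, _ in cells)        # (10.13)
    neyman = sum((o - e) ** 2 / o for o, e in cells)                # (10.14)
    return {"mdi": mdi, "quad": quad, "chi2": chi2,
            "log_form": log_form, "neyman": neyman}


# Hypothetical counts.
stats = two_by_two_statistics(10.0, 20.0, 30.0, 40.0)
for name, value in stats.items():
    print(name, round(value, 4))
```

In a 2x2 table all four deviations from the independence estimate have the
same absolute value, which is why (10.11) and (10.12) coincide exactly
while the other forms agree only approximately.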


11. An Analysis

In order to coordinate and relate the various definitions, concepts,
parameters, computational features, etc. discussed in the preceding
sections we shall consider in detail the analysis of a specific
contingency table.

Table 11.1 is a four-way contingency table of 14,053 men in a
training program, cross-classified on the variables home region, level of
education, race and program completion. We denote the occurrences in the
four-way cross-classification or contingency Table 11.1 by $x(ijk\ell)$
with the notation


Variable              Index     1            2           3             4

Home Region             i       East         North       West          South
Level of Education      j       Below H.S.   H.S.        Above H.S.
Race                    k       White        Non-white
Program                 $\ell$  Failed       Passed
For this analysis we are interested in the possible relationship of
success in training, the dependent variable, on the explanatory variables
home region, level of education, and race. To obtain a good estimate of
the observed contingency table utilizing significant effects and
interactions we shall examine a sequence of minimum discrimination
information estimates based on nested sets of fitted marginals, that is,
each successive estimate uses a set of marginals which explicitly or
implicitly contains the marginals of the preceding estimate and additional
ones to determine the effect of the additional marginals or their
associated interaction tau parameters. The analysis of information table
permits us to judge the significance or non-significance of these effects
or interaction tau parameters.

11.1 Fitting Nested Sets of Marginals. Since we are interested in
the possible relationship of success in training on home region, level
of education and race, we first fit the marginals $x(ijk.)$, $x(...\ell)$
since the corresponding estimate $x^*(ijk\ell) = x(ijk.)x(...\ell)/n$ is
that under the null hypothesis or model of independence of success and the
joint variable (home region, level of education, race) or no interaction
between success and the joint variable. In other words we first want to
determine whether the 24 columns of Table 11.1 are homogeneous or not with
respect to the underlying probabilities of passing or failing. The
associated m.d.i. statistic is

$2I(x:x^*) = 2\sum_i\sum_j\sum_k\sum_\ell x(ijk\ell)\ln\dfrac{x(ijk\ell)}{x^*(ijk\ell)} = 160.551$

with 23 degrees of freedom. We reject the hypothesis of independence or
no interaction. We therefore shall look for explanatory effects.
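
The first fit described above has a closed form, so the statistic can be
computed directly. The sketch below is a minimal illustration: the layout
(one pair of failed and passed counts per column), the function name
`joint_independence_fit`, and the demonstration counts are all our own
assumptions and are not the data of Table 11.1.

```python
import math


def joint_independence_fit(columns):
    """Fit x*(ijkl) = x(ijk.)x(...l)/n and return (estimate, 2I, d.f.).

    columns -- list of (failed, passed) counts, one pair for each
               combination of the explanatory variables.
    """
    n = sum(f + p for f, p in columns)
    fail_total = sum(f for f, _ in columns)
    pass_total = n - fail_total
    estimate = [((f + p) * fail_total / n, (f + p) * pass_total / n)
                for f, p in columns]
    mdi = 0.0
    for (f, p), (ef, ep) in zip(columns, estimate):
        for obs, est in ((f, ef), (p, ep)):
            if obs > 0:  # a zero count contributes nothing to the sum
                mdi += 2.0 * obs * math.log(obs / est)
    # One degree of freedom per column beyond the first.
    return estimate, mdi, len(columns) - 1


# Hypothetical counts for three columns.
demo = [(5, 95), (10, 190), (30, 170)]
estimate, mdi, df = joint_independence_fit(demo)
print(round(mdi, 3), df)
```

With 24 columns, as in the table analyzed here, the same function would
return 23 degrees of freedom.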

In Figure 11.1 there is given the complete schematic for the log-
linear representations. The representation for the estimate of joint
independence $x^*(ijk\ell) = x(ijk.)x(...\ell)/n$ uses columns 1-17,
21-22, 26-31 corresponding to all the marginals explicit and implicit in
the fitted marginal constraints. We can also interpret $2I(x:x^*)$ as
testing a null hypothesis or model that the 23 tau parameters in the
representation of $x$ but not in $x^*$ are zero, that is, the parameters
corresponding to columns 18-20, 23-25, 32-48.

Figure 11.1. Log-linear Representation.


The value of $2I(x:x^*)$ is so large that we reject the model of
joint independence. We therefore proceed to fit a sequence of nested
marginals all including $x(ijk.)$ and various combinations of two- and
three-way marginals combining success with other variables. We summarize
some results in the truncated Analysis of Information Table 11.2.
We have not included all the intermediate fitting sequences for
conciseness. We remark that although the measure of the effect of
additional marginals or their associated parameters may vary according to
the sequence in which they have been added, significant effects tend to
remain significant and non-significant effects tend to stay non-
significant so that the first overall survey should determine the
estimates and interaction parameters which warrant further investigation.
For example, the effect of adding $x(..k\ell)$ to $x(ijk.)$, $x(i..\ell)$,
$x(.j.\ell)$ is given in Analysis of Information Table 11.3 as
$2I(x_f^*:x_a^*) = 1.410$ with one degree of freedom, but the effect of
adding $x(..k\ell)$ to $x(ijk.)$, $x(ij.\ell)$ is given in Analysis of
Information Table 11.2 as $2I(x_e^*:x_m^*) = 1.239$ with one degree of
freedom. In neither case is the effect or the corresponding tau parameter
$\tau_{11}^{k\ell}$ significant.


The columns of Figure 11.1 which occur in the log-linear
representations of the estimates retained in Analysis of Information Table
11.2 are

Marginals Fitted                               Estimate    Columns of Figure 11.1

$x(ijk.)$, $x(...\ell)$                        $x^*$       1-17, 21-22, 26-31
$x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$          $x_a^*$     1-24, 26-31
$x(ijk.)$, $x(ij.\ell)$                        $x_m^*$     1-24, 26-37
$x(ijk.)$, $x(ij.\ell)$, $x(..k\ell)$          $x_e^*$     1-37 .


From the analytic form of the log-linear representation or by
taking differences of appropriate rows of Figure 11.1 within the columns
used for the estimate, the log-odds of fail to pass for each of the
estimates are given by the respective parametric representations in (11.1)
where the superscripts relate to the variables and the subscripts range
over the possible indices. The values of the parameters depend of course
on the corresponding estimate.
TABLE 11.2

ANALYSIS OF INFORMATION TABLE

Component Due to                               Information                    D.F.

   $x(ijk.)$, $x(...\ell)$                     $2I(x:x^*)$       = 160.551     23

a) $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$       $2I(x_a^*:x^*)$   = 138.732      5
                                               $2I(x:x_a^*)$     =  21.819     18

m) $x(ijk.)$, $x(ij.\ell)$                     $2I(x_m^*:x_a^*)$ =   7.384      6
                                               $2I(x:x_m^*)$     =  14.435     12

e) $x(ijk.)$, $x(ij.\ell)$, $x(..k\ell)$       $2I(x_e^*:x_m^*)$ =   1.239      1
                                               $2I(x:x_e^*)$     =  13.196     11

$\dfrac{2I(x:x^*) - 2I(x:x_a^*)}{2I(x:x^*)} = \dfrac{138.732}{160.551} = 0.86$

$\dfrac{2I(x:x^*) - 2I(x:x_m^*)}{2I(x:x^*)} = \dfrac{146.116}{160.551} = 0.91$

$\dfrac{2I(x:x^*) - 2I(x:x_e^*)}{2I(x:x^*)} = \dfrac{147.355}{160.551} = 0.92$

TABLE 11.3

ANALYSIS OF INFORMATION TABLE

Component Due to                                           Information                   D.F.

a) $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$                   $2I(x:x_a^*)$     = 21.819     18

f) $x(ijk.)$, $x(i..\ell)$, $x(.j.\ell)$, $x(..k\ell)$     $2I(x_f^*:x_a^*)$ =  1.410      1
                                                           $2I(x:x_f^*)$     = 20.409     17
(11.1)   $\ln\dfrac{x_a^*(ijk1)}{x_a^*(ijk2)} = \tau_1^{\ell} + \tau_{i1}^{i\ell} + \tau_{j1}^{j\ell}$

         $\ln\dfrac{x_m^*(ijk1)}{x_m^*(ijk2)} = \tau_1^{\ell} + \tau_{i1}^{i\ell} + \tau_{j1}^{j\ell} + \tau_{ij1}^{ij\ell}$

         $\ln\dfrac{x_e^*(ijk1)}{x_e^*(ijk2)} = \tau_1^{\ell} + \tau_{i1}^{i\ell} + \tau_{j1}^{j\ell} + \tau_{k1}^{k\ell} + \tau_{ij1}^{ij\ell}$

We recall that parameters with indices i = 4 and/or j = 3
and/or k = 2 and/or $\ell$ = 2 are by convention set equal to zero.

We remark that $x_m^*(ijk\ell)$, determined by fitting the marginals
$x(ijk.)$, $x(ij.\ell)$, is expressible explicitly as

(11.2)   $x_m^*(ijk\ell) = x(ijk.)x(ij.\ell)/x(ij..)$

and is the estimate under a null hypothesis that race and success are
conditionally independent given home region and level of education.
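
Since this estimate is explicit, it can be formed stratum by stratum. The
following sketch assumes a layout of our own choosing, `table[s][k][l]`,
where `s` runs over the (home region, level of education) combinations,
`k` over race and `l` over the outcome; the counts are hypothetical.

```python
def conditional_independence_fit(table):
    """x*(s, k, l) = x(s, k, .) x(s, ., l) / x(s, ., .) for every stratum s."""
    fitted = []
    for stratum in table:
        total = sum(sum(row) for row in stratum)            # x(ij..)
        row_sums = [sum(row) for row in stratum]             # x(ijk.)
        col_sums = [sum(col) for col in zip(*stratum)]       # x(ij.l)
        fitted.append([[r * c / total for c in col_sums] for r in row_sums])
    return fitted


# Hypothetical counts for two strata, each a 2 x 2 (race by outcome) table.
demo = [
    [[3, 47], [2, 28]],
    [[10, 90], [6, 34]],
]
fit = conditional_independence_fit(demo)
print(fit[0])
```

Within each stratum the fitted table has the same row and column totals as
the observed one, and its cross-product ratio equals one, which is the
conditional independence the hypothesis describes.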

In Analysis of Information Table 11.2 the value $2I(x:x_m^*) = 14.435$,
12 degrees of freedom, indicates an acceptable fit of this model.
Furthermore, $2I(x_e^*:x_m^*) = 1.239$, one degree of freedom, implies
that the additional effect of the marginal $x(..k\ell)$ is not significant
or that in the parametric representation of the log-odds in (11.1) the
parameter $\tau_{k1}^{k\ell}$ measuring the effect of race on the
dependent variable success is not significant. We therefore investigate
the estimate $x_m^*$ in greater detail. The values of $x_m^*(ijk\ell)$ are
given in Table 11.4.

In the expression for the log-odds under $x_m^*$ in (11.1),
$\tau_1^{\ell}$ is an overall average, $\tau_{i1}^{i\ell}$ and
$\tau_{j1}^{j\ell}$ are the effects of home region and level of education
on program completion and $\tau_{ij1}^{ij\ell}$ is the interaction effect
of home region x level of education on program completion.
The numerical values of the tau parameters are given in Table 11.5. We
recall that by convention parameters with an index corresponding to
i = 4 and/or j = 3 and/or $\ell$ = 2 are equal to zero.
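TABLE 11.5

VALUES OF PARAMETERS IN LOG-ODDS FOR $x_m^*$ IN (11.1)

$\tau_1^{\ell}$      = -4.454347        $\tau_{111}^{ij\ell}$ = -0.292498
$\tau_{11}^{i\ell}$  =  0.728653        $\tau_{121}^{ij\ell}$ = -0.689433
$\tau_{21}^{i\ell}$  =  0.041549        $\tau_{211}^{ij\ell}$ = -0.602435
$\tau_{31}^{i\ell}$  = -1.632427        $\tau_{221}^{ij\ell}$ = -1.003045
$\tau_{11}^{j\ell}$  =  1.312903        $\tau_{311}^{ij\ell}$ =  1.137932
$\tau_{21}^{j\ell}$  =  0.648130        $\tau_{321}^{ij\ell}$ =  0.360697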


From the parametric representation of the log-odds in (11.1) and
the values in Table 11.5 one can determine differences in the log-odds
associated with changes in various categories. Thus the differences in
the log-odds (fail to pass) as one changes the home region, for fixed
level of education, are given by

                 E-N        E-W        E-S

Below H.S.      0.9970     0.7287     0.4362
H.S.            1.0007     1.3110     0.0392
Above H.S.      0.6871     2.3611     0.7287

The differences in the log-odds as one changes the level of education for
fixed home region are given by

             Below H.S.-H.S.     H.S.-Above H.S.

East             1.0617             -0.0413
North            1.0654             -0.3549
West             1.4420              1.0088
South            0.6648              0.6481

For easier interpretation, we convert the log-odds values to ratios
of the odds of failure.
                 E/N        E/W        E/S

Below H.S.       2.7        2.1        1.6
H.S.             2.7        3.7        1.0
Above H.S.       2.0       10.6        2.1

             Below H.S./H.S.     H.S./Above H.S.

East              2.9                0.96
North             2.9                0.70
West              4.2                2.7
South             1.9                1.9
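
The conversion from a difference in log-odds to a ratio of odds is simply
exponentiation. As a brief sketch, using three of the log-odds differences
tabulated above (the dictionary keys are labels of our own):

```python
import math

# Differences in log-odds (fail to pass) copied from the tables above.
log_odds_differences = {
    "East vs West, Above H.S.": 2.3611,
    "East: Below H.S. vs H.S.": 1.0617,
    "East: H.S. vs Above H.S.": -0.0413,
}

# A difference d in log-odds corresponds to an odds ratio of exp(d).
odds_ratios = {label: math.exp(d) for label, d in log_odds_differences.items()}

for label, ratio in odds_ratios.items():
    print(label, round(ratio, 2))
```

A negative difference gives a ratio below one, which is how the table
shows that East men with H.S. education fare slightly better than East men
with Above H.S. education.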

Note that the odds of failure in training of a man with home
region East and Above H.S. level of education are 10.6 times the odds
of a man with the same level of education but home region West.

Men with home region East or North but with level of education
H.S. do better than men with the same home region but Above H.S. level
of education.

We have also computed the odds of failure $x_m^*(ijk1)/x_m^*(ijk2)$ and
listed the results in increasing values. The odds are expressed to 1,000,
that is, 2 to 1,000, 6 to 1,000, etc.


Home region     Level of Education     Odds

West            Above H.S.               2
West            H.S.                     6
North           H.S.                     9
South           Above H.S.              12
North           Above H.S.              12
South           H.S.                    22
East            H.S.                    23
East            Above H.S.              24
North           Below H.S.              25
West            Below H.S.              26
South           Below H.S.              43
East            Below H.S.              67
Note that the overall odds of failure for the data are
311/13742 = 0.0226, or 23 to 1,000.

For ease of comparison and inference, we also list the foregoing
results by home region and level of education.

                West     North     South     East

Above H.S.        2        12        12       24
H.S.              6         9        22       23
Below H.S.       26        25        43       67

12. Outliers

We define outliers as observations in one or more cells of a
contingency table which apparently deviate significantly from a fitted
model. These outliers may lead one to reject a model which fits the other
observations. For example, in multi-dimensional contingency tables in
which time or age is one of the classifications there may occur an age
effect such that a model may be rejected for the entire table but a model
taking the possible age effect into account may lead to an acceptable
partitioning of the model.

In other cases even though a model seems to fit, the outliers
contribute much more than is reasonable to the measure of deviation
between the data and the fitted values of the model. In other words, the
outliers make up a large percentage of the "unexplained variation"
$2I(x:x^*)$.
A clue to possible outliers is provided by the output of the
computer program. In the computer output for each estimate five entries
are listed for each cell, the last of these being titled OUTLIER. The
OUTLIER value is a lower bound for the decrease in the corresponding
$2I(x:x^*)$ if that cell were not included in the fitting procedure. The
reduction in the degrees of freedom is one for each omitted cell. Large
values of OUTLIER may be of interest. The basis for the OUTLIER
computation and interpretation follows. Let $x_a^*$ denote the minimum
discrimination information estimate subject to certain marginal
restraints. Let $x_b^*$ denote the minimum discrimination information
estimate subject to the same marginal restraints as $x_a^*$ except that
the value $x(\omega_1)$, say, is not included, so that
$x_b^*(\omega_1) = x(\omega_1)$. The basic additivity property of the
minimum discrimination information statistics states that

         $2I(x:x_a^*) = 2I(x_b^*:x_a^*) + 2I(x:x_b^*)$

or

         $2I(x:x_a^*) - 2I(x:x_b^*) = 2I(x_b^*:x_a^*)$ .

These results are summarized in the Analysis of Information Table 12.1

TABLE 12.1

ANALYSIS OF INFORMATION TABLE

Component due to                                      Information           D.F.

$H_a$ :                                               $2I(x:x_a^*)$         $N_a$

$H_b$ : Same as $H_a$ but omitting $x(\omega_1)$      $2I(x_b^*:x_a^*)$     1
                                                      $2I(x:x_b^*)$         $N_a - 1$


It may be shown that

(12.1)   $2I(x_b^*:x_a^*) \ge 2\left[ x(\omega_1)\ln\dfrac{x(\omega_1)}{x_a^*(\omega_1)} + (n - x(\omega_1))\ln\dfrac{n - x(\omega_1)}{n - x_a^*(\omega_1)} \right]$ .

The last value can be computed and is listed as the OUTLIER entry for each
cell of the computer output for the estimate $x_a^*$. We remark that a
separate outlier computation for each cell is time consuming.
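
A minimal sketch of such an OUTLIER entry follows, assuming the entry is
the two-term lower bound on $2I(x_b^*:x_a^*)$ that compares one cell with
the rest of the table; the function name `outlier_value` is ours. The
numerical example uses the cell (11,2,1) of the smoking data analyzed
below, with observed value 4 and estimate 19.38; the total n = 13,386 is
our own summation over the 14 studies and should be treated as an
assumption.

```python
import math


def outlier_value(observed, estimate, n):
    """Lower bound on the drop in 2I(x:x*) when one cell is left unfitted.

    observed -- x(w1), the observed count in the cell
    estimate -- x*_a(w1), the fitted value in the same cell
    n        -- total count of the table
    """
    rest_observed = n - observed
    rest_estimate = n - estimate
    value = rest_observed * math.log(rest_observed / rest_estimate)
    if observed > 0:  # a zero cell contributes only the second term
        value += observed * math.log(observed / estimate)
    return 2.0 * value


# Cell (11,2,1) of the smoking data; n is an assumed total (see lead-in).
print(round(outlier_value(4.0, 19.38, 13386.0), 2))
```

The bound is zero when the observed and fitted values agree and grows with
the discrepancy, so sorting the cells by this value points to the
candidates for deletion without refitting the model for each cell.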

The ratio

(12.4)   $\dfrac{2I(x:x_a^*) - 2I(x:x_b^*)}{2I(x:x_a^*)} = \dfrac{2I(x_b^*:x_a^*)}{2I(x:x_a^*)}$

then indicates the percentage of the "unexplained variation" due to the
outlier value.


We shall illustrate the outlier procedures and analysis using
data originally given by H.F. Dorn (The relationship of cancer of
the lung and the use of tobacco, American Statistician, 8 (1954),
7-13) and analyzed by J. Cornfield (A statistical problem arising
from retrospective studies, Proc. 3rd Berkeley Symposium 4 (1956),
135-148).
Cox (1970) considers a model in which the logistic
difference is the same for k independent 2x2 contingency
tables. He defines a residual which should behave approximately
like the residuals for a random sample from the unit normal
distribution. He illustrates his graphical analysis with the
data originally given by Dorn and analyzed by Cornfield as
mentioned above.


In Table 12.1 are listed the observations from 14
retrospective studies on the possible association between
smoking and lung cancer. We denote the occurrences in
the three-way 14x2x2 contingency table by x(ijk) with
the notation

Variable     Index     1            2              3         ...     14

Study          i       No. 1        No. 2          No. 3     ...     No. 14
Patients       j       Control      Lung cancer
Smoking        k       Nonsmoker    Smoker

Do these data show association between smoking and lung
cancer, and if so, is the association homogeneous over
the 14 studies? Here the measure of association is the
logarithm of the cross-product ratio.
Table 12.1

Retrospective Studies on the Association between Smoking and Lung Cancer

            Control Patients            Lung Cancer Patients
Study     Non-smokers    Smokers      Non-smokers    Smokers

  1            14            72             3            83
  2            43           227             3            90
  3            19            81             7           129
  4           125           397            12            70
  5           131           299            32           412
  6           114           666             8           597
  7            12           174             5            88
  8            61          1296             7          1350
  9            27           106             3            60
 10            81           534            18           459
 11            54           246             4           724
 12            56           462            19           499
 13           636          1729            39           451
 14            28           259             5           260
The hypothesis of conditional independence given the
study

(12.5)   $H_a : \dfrac{p(ijk)}{p(i..)} = \dfrac{p(ij.)}{p(i..)} \cdot \dfrac{p(i.k)}{p(i..)}$

imposes the restraints on the estimate $x_a^*(ijk)$ that

(12.6)   $x_a^*(ij.) = x(ij.)$ and $x_a^*(i.k) = x(i.k)$ .

In fact $x_a^*(ijk)$ may be explicitly represented by

(12.7)   $x_a^*(ijk) = x(ij.)x(i.k)/x(i..)$ .

Similarly the hypothesis of no second-order interaction

(12.8)   $H_2 : p(ijk) = a(ij)b(ik)c(jk)$

imposes the restraints on the estimate $x_2^*(ijk)$ that

(12.9)   $x_2^*(ij.) = x(ij.)$, $x_2^*(i.k) = x(i.k)$, $x_2^*(.jk) = x(.jk)$ .

The estimate $x_2^*(ijk)$ cannot be represented as an explicit
function of the observed marginals.
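
Because no explicit formula exists, an estimate satisfying (12.9) has to
be found iteratively. This section does not describe the algorithm used
by the computer program; the sketch below assumes iterative proportional
fitting started from a uniform table, a standard way to fit a set of
marginals, and uses a function name and demonstration counts of our own.

```python
def fit_two_way_marginals(x, sweeps=200):
    """Fit x(ij.), x(i.k), x(.jk) to a three-way table x[i][j][k]."""
    I, J, K = len(x), len(x[0]), len(x[0][0])
    n = float(sum(x[i][j][k] for i in range(I) for j in range(J) for k in range(K)))
    est = [[[n / (I * J * K) for _ in range(K)] for _ in range(J)] for _ in range(I)]
    for _ in range(sweeps):
        for i in range(I):                      # rescale to match x(ij.)
            for j in range(J):
                cur, obs = sum(est[i][j]), sum(x[i][j])
                if cur > 0:
                    est[i][j] = [v * obs / cur for v in est[i][j]]
        for i in range(I):                      # rescale to match x(i.k)
            for k in range(K):
                cur = sum(est[i][j][k] for j in range(J))
                obs = sum(x[i][j][k] for j in range(J))
                if cur > 0:
                    for j in range(J):
                        est[i][j][k] *= obs / cur
        for j in range(J):                      # rescale to match x(.jk)
            for k in range(K):
                cur = sum(est[i][j][k] for i in range(I))
                obs = sum(x[i][j][k] for i in range(I))
                if cur > 0:
                    for i in range(I):
                        est[i][j][k] *= obs / cur
    return est


# Small illustrative 3 x 2 x 2 table (study, patient group, smoking).
demo = [[[14, 72], [3, 83]], [[43, 227], [3, 90]], [[19, 81], [7, 129]]]
fit = fit_two_way_marginals(demo)
print([[round(v, 2) for v in row] for row in fit[0]])
```

Every rescaling multiplies the table by a factor depending on only two of
the three indices, so the fitted values keep the product form of (12.8)
and share one cross-product ratio across the studies.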


In this study, the minimum discrimination information
statistics are log-likelihood ratio chi-squares and the
associated Analysis of Information table permits us to
test the goodness-of-fit of the estimates, since the effect
of fitting the marginal $x(.jk)$ in addition to the marginals
$x(ij.)$, $x(i.k)$ is a measure of the significance of the
common value of the association parameter

(12.10)  $\tau_{11}^{jk} = \ln\dfrac{x_2^*(i11)\,x_2^*(i22)}{x_2^*(i12)\,x_2^*(i21)}$ ,  i = 1, ..., 14 .

Specifically, the log-odds of control to lung cancer
for the estimates $x_a^*$ and $x_2^*$ are given by the log-linear
representation

(12.11)  $\ln\dfrac{x_a^*(i1k)}{x_a^*(i2k)} = \tau_1^{j} + \tau_{i1}^{ij}$

         $\ln\dfrac{x_2^*(i1k)}{x_2^*(i2k)} = \tau_1^{j} + \tau_{i1}^{ij} + \tau_{1k}^{jk}$

where the values of the tau parameters depend of course
on the estimate.

Table 12.2

Analysis of Information

Component due to                     Information                     D.F.

$x(ij.)$, $x(i.k)$                   $2I(x:x_a^*)$     = 549.74       14

$x(ij.)$, $x(i.k)$, $x(.jk)$         $2I(x_2^*:x_a^*)$ = 494.55        1
                                     $2I(x:x_2^*)$     =  55.19       13
The value of $2I(x:x_a^*)$ when compared to the $\alpha$-th
percentile of a $\chi^2$ distribution suggests that the null
hypothesis of no association between smoking and lung
cancer conditioned on the study is false. This conditional
hypothesis allows the accumulation of information from
different studies without imposing the requirement that
the population characteristics of each study be similar.

The rejection of this conditional independence hypothesis
is of course expected. The degree of departure from
independence is functionally dependent on the study. Is
this dependence the result of a small subset of the studies
which are substantially different from the remainder, or
does the departure vary along a continuum?

The value of $2I(x_2^*:x_a^*)$ suggests that in accordance
with (12.10) the value of $\tau_{11}^{jk} = 1.687$ is significantly
different from zero. Moreover, $2I(x:x_2^*)$ is also significant
when compared to the $\alpha$-th percentile of a $\chi^2$ distribution.
The value of $2I(x:x_2^*)$ suggests that we reject the null
hypothesis of no second-order interaction, that is, the
model with a common value of the interaction parameter
is not a good fit. The values of $x_2^*$ are given in Table 12.3.
We now proceed to determine the outliers.


Table 12.3

$x_2^*(ijk)$

            Control Patients            Lung Cancer Patients
Study     Non-smokers    Smokers      Non-smokers    Smokers

  1          14.01         71.99          2.99          83.01
  2          42.86        227.14          3.14          89.86
  3          19.99         80.01          6.01         129.99
  4         132.16        389.83          4.84          77.16
  5         130.03        300.00         32.97         410.99
  6         105.06        674.94         16.94         588.06
  7          15.47        170.53          1.54          91.47
  8          57.06       1299.93         10.94        1346.08
  9          27.15        105.85          2.85          60.15
 10          85.21        529.79         13.79         463.21
 11          38.62        261.39         19.38         708.62
 12          62.23        455.77         12.77         505.23
 13         643.32       1721.66         31.69         458.33
 14          27.84        259.16          5.17         259.84
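
A quick consistency check on a table of this kind is that, under the
model of no second-order interaction, every study must show the same log
cross-product ratio, namely the value 1.687 quoted for (12.10). The
sketch below applies that check to four rows of Table 12.3; the helper
name `log_cross_product_ratio` is ours, and the tolerance allows for the
two-decimal rounding of the tabulated values.

```python
import math


def log_cross_product_ratio(row):
    """ln[x(i11)x(i22) / (x(i12)x(i21))] for one study.

    row -- (control non-smokers, control smokers,
            lung cancer non-smokers, lung cancer smokers)
    """
    c_non, c_smk, l_non, l_smk = row
    return math.log(c_non * l_smk / (c_smk * l_non))


# Rows for studies 2, 9, 10 and 11 as tabulated in Table 12.3.
rows = {
    2: (42.86, 227.14, 3.14, 89.86),
    9: (27.15, 105.85, 2.85, 60.15),
    10: (85.21, 529.79, 13.79, 463.21),
    11: (38.62, 261.39, 19.38, 708.62),
}
values = {study: log_cross_product_ratio(r) for study, r in rows.items()}
for study, value in values.items():
    print(study, round(value, 3))
```

The same check can be run on any fitted table to confirm that the common
association parameter has been reproduced.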
Examination of the computer output for $x_2^*$ using
all 14 studies showed a largest OUTLIER value of 18.14
for the cell (11,2,1). A new estimate fitting the marginals
$x(ij.)$, $x(i.k)$, $x(.jk)$ and omitting the cell
(11,2,1) was obtained. In fact Study 11 was omitted
because with the constraints for the new estimate
$x_b^*(11,j,k) = x(11,j,k)$. Since this estimate yielded

(12.12)  $2I(x:x_b^*) = 28.40$, 12 d.f.

the deletion procedure was continued. We summarize the
results in Table 12.4 and Analysis of Information Table 12.5.


Table 12.4

Fitting $x(ij.)$, $x(i.k)$, $x(.jk)$ with sequential
deletion of studies

                             Largest OUTLIER
Studies                      Cell        Value      Information               D.F.

1-14                         (11,2,1)    18.14      $2I(x:x_2^*)$ = 55.19      13
1-10, 12-14                  (6,2,1)      7.89      $2I(x:x_b^*)$ = 28.40      12
1-5, 7-10, 12-14             (4,2,1)      4.67      $2I(x:x_c^*)$ = 18.03      11
1-3, 5, 7-10, 12-14          (7,2,1)      3.91      $2I(x:x_d^*)$ = 11.94      10
1-3, 5, 8-10, 12-14                                 $2I(x:x_e^*)$ =  7.03       9
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Table  12.5 


Analysis  oL  Information 

CoruiJon-,'ii  t due  to  In  forma  L icm  D.B‘. 


All 

14  studies 

21 (x : X“  ) ~5  5. 19 

13 

Less 

11 

21  "-20.70 

1 

21 (x:x*) =28. 40 

12 

Less 

11,6 

21 (x* :x*) =10.37 

C D 

1 

21  (x : x* ) =18 . 03 
c 

11 

Less 

11,6,4 

21 (x* :x*) = 6.08 

d c 

1 

21  (x:x*) =11. 94 

10 

Less 

11,6,4,7 

21  (::*  : x*  ) = 4.92 

1 

21  (x : x* ) = 7.03 

e 

9 

Since (2I(x:x*_a) - 2I(x:x*_e))/2I(x:x*_a)
= 2I(x*_e:x*_a)/2I(x:x*_a) = 48.16/55.19 = 0.87 we see that the
four studies numbered 4, 6, 7, 11 contributed 87% of the
"unexplained variation" 2I(x:x*_a). The values of the
estimate x*_e are given in Table 12.6. The value of the
log cross-product ratio is


(12.13)   ln [x*_e(i11) x*_e(i22) / (x*_e(i12) x*_e(i21))] = 1.55,

          i = 1-3, 5, 8-10, 12-14.
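
The common value in (12.13) can be checked by hand from the fitted values. The following is a minimal sketch, assuming the entries of x*_e for studies 3, 9 and 14 as read from Table 12.6; the function name and the choice of rows are illustrative and are not part of the computer programs referred to in the text.

```python
import math

def log_cross_product_ratio(x11, x12, x21, x22):
    """ln[(x11 * x22) / (x12 * x21)] for one 2x2 layer of the fitted table."""
    return math.log((x11 * x22) / (x12 * x21))

# Each row: x*_e(i11), x*_e(i12), x*_e(i21), x*_e(i22), that is
# control non-smokers, control smokers,
# lung cancer non-smokers, lung cancer smokers.
rows = {
    3: (19.40, 80.60, 6.60, 129.40),
    9: (26.80, 106.20, 3.20, 59.80),
    14: (27.24, 259.76, 5.76, 259.24),
}

ratios = {study: log_cross_product_ratio(*cells) for study, cells in rows.items()}
for study, value in ratios.items():
    print(study, round(value, 2))
```

Each retained study gives the same value to two decimals, which is what the no second-order interaction fit requires.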
                          Table 12.6

                          x*_e(ijk)

             Control Patients         Lung Cancer Patients
Study     Non-Smokers   Smokers     Non-Smokers   Smokers

  1          13.69        72.32         3.32        82.69
  2          42.46       227.53         3.54        89.47
  3          19.40        80.60         6.60       129.40
  5         126.85       303.18        36.15       407.83
  8          55.79      1301.21        12.21      1344.80
  9          26.80       106.20         3.20        59.80
 10          83.61       531.39        15.39       461.62
 12          60.81       457.20        14.19       503.81
 13         639.35      1725.64        35.66       454.36
 14          27.24       259.76         5.76       259.24

We note that Cox (1970) in analyzing the data of
Table 12.1 concluded that studies 8, 6, and 11 were outliers.
For the 14 studies he found a residual sum of squares 47.7
with 13 degrees of freedom. With studies 8, 6, and 11
omitted he found a residual sum of squares 15.1 with 10
degrees of freedom. (Cox (1970) p. 83 gives the degrees
of freedom as 11, a misprint.)

Following the procedure described, when Studies 6,
8, and 11 were omitted the results led to the Analysis of
Information Table 12.7. Note that omitting Studies 11, 6,
4 as per Table 12.5 accounts for more of the unexplained
variation.

                          Table 12.7

                   Analysis of Information

Component due to        Information                D.F.

All 14 studies          2I(x:x*_a)  = 55.19         13
Less 6,8,11             2I(x*:x*_a) = 41.62          3
                        2I(x:x*)    = 13.57         10

Here x* denotes the estimate obtained with Studies 6, 8, and 11
omitted.
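
The comparison between Table 12.5 and Table 12.7 is simple arithmetic on the tabulated values of 2I. The sketch below, with variable names chosen here for illustration, recomputes the single-degree-of-freedom components and the fractions of the "unexplained variation" removed by each deletion scheme. Because it works from the printed two-decimal values, a component may differ by one unit in the last digit from the tabulated one.

```python
# 2I(x:x*) after each sequential deletion, as tabulated in Table 12.5.
sequential = [
    ("all 14 studies", 55.19),
    ("less 11", 28.40),
    ("less 11,6", 18.03),
    ("less 11,6,4", 11.94),
    ("less 11,6,4,7", 7.03),
]
total = sequential[0][1]

# Component attributed to each deleted study: drop in 2I between steps.
components = [
    round(before[1] - after[1], 2)
    for before, after in zip(sequential, sequential[1:])
]

# Fraction of the unexplained variation removed by each scheme.
fraction_four_studies = (total - 7.03) / total    # studies 4, 6, 7, 11
fraction_three_studies = (total - 11.94) / total  # studies 11, 6, 4
fraction_cox = 41.62 / total                      # studies 6, 8, 11 (Table 12.7)

print(components)
print(round(fraction_four_studies, 2),
      round(fraction_three_studies, 2),
      round(fraction_cox, 2))
```

Deleting Studies 11, 6, 4 removes a larger fraction than deleting Studies 6, 8, 11, in agreement with the remark preceding Table 12.7.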


The sequential procedure discussed herein was also
applied to data relating father and son professions
published by Karl Pearson (1904), "On the theory of
contingency and its relation to association and normal
correlation," reprinted in Karl Pearson's Early Papers,
Cambridge University Press, 1948, and considered by
Fienberg (1969) and Good (1956). Using the Pearson data
Fienberg obtained an X^2 = 184.9 with 44 out of 196 cells
deleted whereas the sequential procedure led to an X^2 = 155.3
with 25 cells deleted.

13. Zero Marginals

As  may  be  noted  from  the  analysis  in  Section  11,  zero 
occurrences  in  cells  of  the  observed  contingency  table 
present  no  special  problem  provided  that  no  marginal  entering 
into  the  fitting  specification  is  zero.  When  the  latter  is 
the  case,  however,  the  interpretation  may  be  distorted 
because  of  inflated  degrees  of  freedom.  A procedure  to 
circumvent  this  problem  is  similar  to  that  used  for  getting 
revised  estimates  when  outliers  are  indicated.  We  shall 
present  the  procedure  in  terms  of  a specific  example. 

The following data resulted from a study of Christmas
tree consumption. We are indebted to Dipl. Forstwirt Dietrich
v. Staden, Institut f. Forstbenutzung, Universitaet Goettingen
for the data and permission to use it. In Table 13.1 are
listed responses to the question "Did you have a Christmas
tree in your apartment/house last year or not?" according to
size of household and size of city. We denote the occurrences
in the three-way 2x9x5 contingency table by x(ijk) with the
notation


Index    Tree (i)    Household size (j)    City size (k)

  1      Yes         1                     <2000
  2      No          2                     2000 to 20000
  3                  3                     20000 to 100000
  4                  4                     100000 to 500000
  5                  5                     500000 or more
  6                  6
  7                  7
  8                  8
  9                  9


For a 2x9x5 RxCxD contingency table we compute an
estimate under a hypothesis of no second-order interaction
by fitting all the two-way marginals. Call this estimate
x*_b(ijk). A test for the null hypothesis of no second-order
interaction is given by


     2I(x:x*_b) = 2 Σ x(ijk) ln (x(ijk)/x*_b(ijk)),    32 D.F.


If there is no second-order interaction then the associations
between R and C, R and D, C and D are the same for all values
of the third variable, that is,

     ln [x*_b(1jk) x*_b(29k) / (x*_b(2jk) x*_b(19k))]  depends only on j,

     ln [x*_b(1jk) x*_b(2j5) / (x*_b(2jk) x*_b(1j5))]  depends only on k,

     ln [x*_b(ijk) x*_b(i95) / (x*_b(ij5) x*_b(i9k))]  is independent of i.

Within this model a test whether the x(i.k) marginal
contributes significantly is obtained by computing an estimate
fitting the marginals x(ij.), x(.jk). Call this
estimate x*_a(ijk), which can be expressed as x*_a(ijk) =
x(ij.)x(.jk)/x(.j.). We recognize x*_a(ijk) as the estimate
under an hypothesis of conditional independence of R and D
given C. We now have Analysis of Information Table 13.2.


                          Table 13.2

Component due to               Information        D.F.

Conditional independence
of R and D given C             2I(x:x*_a)          36

Effect of x(i.k) given
x(ij.) and x(.jk)              2I(x*_b:x*_a)        4

No second-order
interaction                    2I(x:x*_b)          32

For the particular data in question however, because
x(ijk) = 0 for j = 6,7,8,9, i = 2 and also for some of i = 1,
j = 7,8,9, the estimates for the entries corresponding to
x(ijk) for j = 6,7,8,9, for both x*_a and x*_b, will not differ
from the observed value. Accordingly let us compute an
estimate x*_d(ijk) which is obtained by fitting the two-way
marginals of the 2x5x5 table j = 1,2,3,4,5 and x*_d(ijk) = x(ijk),
j = 6,7,8,9. Similarly let x*_c(ijk) = x(ij.)x(.jk)/x(.j.) for
the 2x5x5 table j = 1,2,3,4,5 and x*_c(ijk) = x(ijk), j = 6,7,8,9.
We now find

                          Table 13.3

Component due to               Information                D.F.

Conditional independence
of R and D given C             2I(x:x*_c)    = 25.532      20

Effect of x(i.k) given
x(ij.) and x(.jk)              2I(x*_d:x*_c) =  5.821       4

No second-order
interaction                    2I(x:x*_d)    = 19.711      16


Note the reduction in the degrees of freedom between Table 13.2
and Table 13.3. It is also interesting to note that when
actually carrying out the procedures for Table 13.2, the same
estimates and statistics were obtained as for Table 13.3. See
Table 13.4 and 13.5, Table 13.6 and 13.7.
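
The reduction in the degrees of freedom follows from the usual counts for a three-way RxCxD table. A minimal sketch, with function names chosen here for illustration, compares the full table with 9 household sizes against the reduced table with 5:

```python
def df_conditional_independence(r, c, d):
    """D.F. for conditional independence of R and D given C."""
    return c * (r - 1) * (d - 1)

def df_no_second_order_interaction(r, c, d):
    """D.F. for no second-order interaction (all two-way marginals fitted)."""
    return (r - 1) * (c - 1) * (d - 1)

summary = {}
for c in (9, 5):  # household sizes retained: all nine, or only j = 1,...,5
    ci = df_conditional_independence(2, c, 5)
    ni = df_no_second_order_interaction(2, c, 5)
    summary[c] = (ci, ci - ni, ni)  # (cond. indep., effect of x(i.k), no interaction)
    print(c, summary[c])
```

With household sizes 6, 7, 8, 9 removed from the fit, the counts drop from 36, 4, 32 to 20, 4, 16, while the component for the effect of x(i.k) keeps its 4 degrees of freedom.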

It seems reasonable to conclude that the purchase of a
Christmas tree is independent of the size of the city given
the size of the household (j = 1,2,3,4,5), since the conditional
independence component, 25.532 with 20 D.F., does not exceed
the upper 5% point of chi-square, 31.41, and households of
size 6,7,8,9 seem almost sure to buy Christmas trees.

The log-odds for the purchase of a Christmas tree as a
function of household size is given in Table 13.8. The
probability estimate for a purchase as a function of household
size is given in Table 13.9.

                          Table 13.8

          ln (x*(1jk)/x*(2jk)) = ln (x(1j.)/x(2j.))

                    j=1      -0.2586
                      2       0.8662
                      3       2.1702
                      4       3.4012
                      5       2.3716


                          Table 13.9

                        x(1j.)/x(.j.)

                    j=1      61/140 = 0.4357
                      2     214/304 = 0.7039
                      3     219/244 = 0.8975
                      4     180/186 = 0.9677
                      5       75/82 = 0.9146
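
Tables 13.8 and 13.9 are two views of the same marginal counts. The sketch below recomputes both from x(1j.) and x(.j.); the j = 5 count is taken as 75 out of 82, which is the value consistent with both the tabulated log-odds 2.3716 and the tabulated probability 0.9146. The variable names are illustrative only.

```python
import math

# (x(1j.), x(.j.)): households with a tree, and all households, by size j.
counts = {1: (61, 140), 2: (214, 304), 3: (219, 244), 4: (180, 186), 5: (75, 82)}

log_odds = {}
probability = {}
for j, (yes, total) in counts.items():
    no = total - yes                     # x(2j.)
    log_odds[j] = math.log(yes / no)     # Table 13.8
    probability[j] = yes / total         # Table 13.9
    # The two are linked by p = 1 / (1 + exp(-log-odds)).
    assert abs(probability[j] - 1.0 / (1.0 + math.exp(-log_odds[j]))) < 1e-12
    print(j, round(log_odds[j], 4), round(probability[j], 4))
```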


For more complex situations there is also the log-linear
analysis, which is of course available for this problem too,
but it would not add anything to the analysis of this
particular data.

of multivariate (multiple variates) analysis with particular application
to qualitative or categorical as well as quantitative variables.

The analysis is concerned with counts in multiway cross-classifications
or multiway contingency tables. Multiway contingency tables, or
cross-classifications of vectors of discrete random variables, provide a
useful approach to the analysis of multivariate discrete data.

The method of analysis presented will bring out the various
interrelationships among the classificatory variables in a multiway
cross-classification or contingency table in many dimensions.

The procedure is based on the Principle of Minimum Discrimination
Information Estimation, associated statistics and Analysis of
Information. General computer programs are available to provide the
necessary results for inference. An analysis of a four-way contingency
table is presented for illustration of these techniques.